Original Data and Code Access

The results in this file are based on an expanded universe that includes NYSE, NASDAQ and AMEX stocks (per the AE comments). The results based on the NYSE universe only can be accessed here.

Size-Operating profitability portfolios are formed based on the NYSE, NASDAQ and AMEX.

Further improvement: change the risk-free rate data to series that are compatible with the different data frequencies.

Data cleaning and Preparation

Information about initial coding setup:

  1. freq sets the data frequency for the following analysis: 12 for monthly data, 4 for quarterly data and 1 for annual data.
  2. start.ym gives the earliest reasonable starting point of the series, January 1966, chosen based on the number of firms available in the data set.
  3. After the preliminary data cleaning, port_market holds the market portfolio data (including NYSE, NASDAQ and AMEX) and ports_all contains the decile portfolios. The cleaned data are written to CSV files whose names are stored in market.names and data.names. I’ve finished creating the measures based on characteristic deciles, so I’ll have a close look at your results shortly. The decile data is attached. As mentioned, these are based on a single characteristic sort, which will hopefully provide new insight into characteristic-based predictability. The characteristics are as follows:
  4. RF denotes the risk-free rate, which is the average of the bid and ask.
# 0. record datasets ----
## 0.1 initial value setup ----
freq = 12 # the frequency of the data <- 12 for monthly; 4 for quarterly; 1 for annually
start.ym = as.yearmon(1966) # the starting time

month_select <- function(freq) { # return the month names kept for each data frequency
  if (freq == 12) {
    return(c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November","December"))
  }
  if (freq == 4) {
    return(c("March", "June", "September", "December"))
  }
  if (freq == 1) {
    return("December")
  }
}
freq_name <- function(freq) { # return a text label for the data frequency
  if (freq == 12) {
    return("monthly")
  }
  if (freq == 4) {
    return("quarterly")
  }
  if (freq == 1) {
    return("annual")
  }
}

## 0.2 preliminary data cleaning ----
port_market <- read.csv("Allfirms_comp2.csv") %>%
  as.tbl() %>%
  # mutate(E = rollmeanr(E, k = 12, fill = NA_real_)) %>%
  mutate(month = as.yearmon(as.Date(date)),
         vwret = rollsumr(vwret, k = 12/freq, fill = NA)) %>%
  filter(months(month) %in% month_select(freq = freq))
write.csv(x = port_market, file = "market_Allfirms.csv")

ports_all <- read.csv("deciles_sz2.csv") %>%
  as.tbl() %>%
  mutate(month = as.yearmon(as.Date(jdate)), # (, format = "%d/%m/%Y")
         port = as.character(port))
ports <- unique(ports_all$port) # identifiers for portfolios
for (p in ports) {
  ports.dt <- ports_all %>%
    filter(port == p) %>%
    arrange(month) %>%
    # mutate(vwret = rollsumr(vwret, k = 12/freq, fill = NA)) %>%
    filter(months(month) %in% month_select(freq = freq)) # %>%
    # mutate(E = rollmeanr(E, k = 12, fill = NA_real_))
  ports.dt <- ports.dt[, -1]
  write.csv(x = ports.dt, file = paste("port_", p, ".csv", sep = ""))
}

## 0.3 name portfolios and predictors ----
market.names <- list.files(pattern = "market_")
data.names <- paste("port_", "D", 1:10, ".csv", sep = "") # data for portfolios # list.files(pattern = "port_") # 
id.names <-  c("Market", paste("D", 1:10, sep = "")) # set plot names # c("Market", "B", "M", "S") # c("Market", "H", "M", "L") # c("Market", "W", "N", "L")
ratio_names <- c("DP", "PE", "EY", "DY", "Payout") # potential predictors

## 0.4 risk-free rate
RF <- read.csv("Rfree_t30.csv") %>% # record the risk free rate
  as.tbl() %>% # as the average of the bid and ask.
  select(-X) %>%
  mutate(month = as.yearmon(month)) %>%
  filter(months(month) %in% month_select(freq = freq)) %>%
  filter(month >= start.ym)

Note: the big-value and small-growth portfolios appear to include fewer firms than the other four characteristic portfolios, roughly half as many.
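
The frequency handling in section 0.2 (a right-aligned rolling sum over 12/freq months, followed by keeping only period-end months) can be sketched in base R as follows. This is a minimal illustration and not the report's code: the helper roll_sum_right, the constant returns and the dates are hypothetical, and summing returns over a period is only valid if vwret is a log (continuously compounded) return, which is an assumption here. Months are selected by number rather than by name, which avoids the locale dependence of month names.

```r
# Minimal sketch: aggregate monthly log returns to a lower frequency.
freq <- 4          # 12 = monthly, 4 = quarterly, 1 = annual
k <- 12 / freq     # number of months per period

# Hypothetical data: two calendar years of constant monthly log returns.
dates <- seq(as.Date("2000-01-01"), by = "month", length.out = 24)
vwret <- rep(0.01, 24)

# Right-aligned rolling sum over k observations (NA until k values exist).
# Hypothetical helper; assumes x contains no missing values.
roll_sum_right <- function(x, k) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n >= k) {
    cs <- cumsum(x)
    out[k:n] <- cs[k:n] - c(0, cs)[seq_len(n - k + 1)]
  }
  out
}

# Period-end months: 1..12 for monthly, 3/6/9/12 for quarterly, 12 for annual.
period_end_months <- seq(k, 12, by = k)
month_num <- as.integer(format(dates, "%m"))

agg <- data.frame(date = dates, vwret = roll_sum_right(vwret, k))
agg <- agg[month_num %in% period_end_months, ]
print(agg)
```

With freq set to 4, each retained row holds the sum of the three monthly log returns that make up the quarter.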

Figure 1 - Log Cumulative Index

Log cumulative realised portfolio return components for seven portfolios: the market portfolio and six size and book-to-market equity ratio sorted portfolios. All following figures show the monthly realised price-earnings ratio growth (gm), earnings growth (ge), dividend-price ratio (dp) and the portfolio return index (r), with the values in January 1966 set to zero for all portfolios.

# TABLE-1. summary statistics ----
TABLE1.uni <- list() # the univariate statistics
TABLE1.cor <- list() # the correlation matrices

PE.df <- data.frame(month = port_market$month[port_market$month >= start.ym])
EY.df <- data.frame(month = port_market$month[port_market$month >= start.ym])
DP.df <- data.frame(month = port_market$month[port_market$month >= start.ym])

## (1*) summary tables for Summary & Correlations ----
c <- 0
for (id in c(market.names, data.names)) {
  c <- c + 1
  # print(id); print(id.names[c])
  
  ## 1. read the data ----
  data_nyse <- read.csv(id) %>%
    as.tbl() %>%
    mutate(month = as.yearmon(month)) %>%
    filter(month >= start.ym) %>% # start from "Jan 1966"
    select(month, r = vwret, P, E, D) %>%
    mutate(DP = D / P, # these are adjusted by the log transformation
           PE = P / E,
           EP = E / P,
           EY = E / lag(P), # earnings yield
           DY = D / lag(P), # dividend yield
           Payout = D / E) # payout ratios
  
  PE.df <- cbind.data.frame(PE.df, data_nyse$PE)
  EY.df <- cbind.data.frame(EY.df, data_nyse$EY)
  DP.df <- cbind.data.frame(DP.df, data_nyse$DP)
  
  ## 2. return decomposition ----
  data_decompose <- data_nyse %>%
    mutate(r = r, # cts returns = log total returns
           gm = log(PE) - lag(log(PE)), # multiple expansion rate
           ge = log(E) - lag(log(E)), # earnings growth rate
           dp = log(1 + DP/freq)) %>% # only 1/freq of the annual dividends (1/12 for monthly data)
    na.omit()
  
  ## 3. summary-Stat ----
  ar1.coef <- function(x) { # slope coefficient of an AR(1) regression of x on its first lag
    # dplyr::lag shifts the values; stats::lag would leave a plain vector unshifted
    return(as.numeric(lm(x ~ dplyr::lag(x))$coefficients[2]))
  }
  
  comp_summary.dt <- data_decompose %>%
    select(gm, ge, dp, r) %>%
    describe() %>%
    mutate(mean = mean * 100,
           sd = sd * 100,
           median = median * 100,
           min = min * 100,
           max = max * 100) %>%
    select(Mean = mean, Median = median, SD = sd, Min = min, Max = max, Skew = skew, Kurt = kurtosis) %>%
    round(digits = 4)
  
  comp_summary.dt$"AR(1)" <- data_decompose %>%
    select(gm, ge, dp, r) %>%
    apply(2, ar1.coef) %>%
    round(digits = 4)
  
  ### Store the summary stat
  # print(paste("Data starts from ", first(data_decompose$month), " and ends in ", last(data_decompose$month), ".", sep = ""))
  TABLE1.uni[[id.names[c]]] <- comp_summary.dt
  
  ## 4. correlations ----
  comp_cor <- data_decompose %>% select(gm, ge, dp, r) %>% cor()
  TABLE1.cor[[id.names[c]]] <- comp_cor
  
  # Figure-1. cumulative realised return components ---- 
  # jpeg(filename = paste("Figure1_", id.names[c], ".jpeg", sep = ""), width = 550, height = 350)
  par(mar = c(2, 4, 2, 1))
  cum_components.ts <- data_decompose %>%
    select(r, gm, ge, dp) %>%
    apply(2, cumsum) %>%
    ts(start = data_decompose$month[1], frequency = freq)
  plot.ts(cum_components.ts, plot.type = "single", lty = 1:4, main = id.names[c], cex.main = 1, 
          xlab = NULL, ylab = "Cumulative Return and Components Indices")
  legend("topleft",
         legend = c("Total return", "Price earnings growth",
                    "Earnings growth", "Dividend price"),
         lty = 1:4,
         cex = 1.0) # text size
  # dev.off()
  par(mar = c(5, 4, 4, 2) + 0.1)
}

write.csv(TABLE1.uni, file = "table_1.uni.csv")
write.csv(TABLE1.cor, file = "table_1.cor.csv")

Table 1 - Summary statistics of returns components

The correlations between gm and ge might be a bit too high compared with Ferreira and Santa-Clara (2011). The code needs to be checked again.

Need to go back to the construction process of Prof Robert Shiller’s CAPE.

‘kable’ for Table Creation

Table 1 - Summary statistics of returns components
monthly data starts from Jun 1967 and ends in Dec 2019.
Columns 1-8 (Panel A: univariate statistics): Mean Median SD Min Max Skew Kurt AR(1)
Columns 9-12 (Panel B: correlations): gm ge dp r
Market
gm 0.02 -0.03 3.12 -15.26 13.28 -0.19 4.42 0.92 1.00 -0.51 -0.03 0.07
ge 0.76 1.11 5.34 -22.01 19.34 -0.50 2.44 0.33 -0.51 1.00 -0.03 0.81
dp 0.28 0.27 0.09 0.09 0.50 0.14 -0.70 0.98 -0.03 -0.03 1.00 -0.03
r 0.94 1.25 4.43 -22.48 16.58 -0.51 1.85 0.05 0.07 0.81 -0.03 1.00
D1
gm -0.30 -0.49 3.57 -17.75 18.44 0.36 5.75 0.88 1.00 -0.41 -0.06 0.09
ge 1.68 1.60 7.19 -33.48 29.66 -0.30 2.72 0.37 -0.41 1.00 0.00 0.86
dp 1.16 0.36 3.04 0.12 22.66 5.09 26.77 1.00 -0.06 0.00 1.00 -0.05
r 1.11 1.23 6.44 -29.96 31.76 -0.09 2.74 0.21 0.09 0.86 -0.05 1.00
D2
gm -0.19 -0.30 3.70 -18.46 16.67 0.13 4.87 0.90 1.00 -0.44 0.04 0.07
ge 1.34 1.35 7.28 -32.46 28.55 -0.35 2.49 0.33 -0.44 1.00 -0.03 0.86
dp 0.85 0.32 1.87 0.11 10.56 3.78 13.16 0.97 0.04 -0.03 1.00 -0.02
r 1.10 1.37 6.46 -30.99 29.83 -0.22 2.25 0.13 0.07 0.86 -0.02 1.00
D3
gm -0.17 -0.13 3.71 -18.91 18.70 0.01 5.21 0.90 1.00 -0.48 0.11 0.05
ge 1.35 1.59 7.02 -29.92 26.02 -0.47 2.41 0.35 -0.48 1.00 -0.09 0.85
dp 0.39 0.28 0.64 0.08 5.15 6.29 40.46 0.92 0.11 -0.09 1.00 -0.03
r 1.16 1.63 6.02 -29.67 26.24 -0.46 1.99 0.11 0.05 0.85 -0.03 1.00
D4
gm -0.16 -0.19 3.62 -19.38 15.75 -0.17 4.78 0.90 1.00 -0.48 0.02 0.06
ge 1.19 1.38 6.74 -30.14 26.78 -0.53 2.77 0.34 -0.48 1.00 -0.07 0.84
dp 0.28 0.23 0.25 0.08 2.29 4.85 28.24 0.96 0.02 -0.07 1.00 -0.05
r 1.06 1.40 5.81 -29.63 25.42 -0.51 2.14 0.10 0.06 0.84 -0.05 1.00
D5
gm -0.17 -0.14 3.54 -19.92 15.94 -0.30 5.48 0.89 1.00 -0.48 -0.03 0.07
ge 1.21 1.27 6.46 -30.38 24.77 -0.48 2.70 0.34 -0.48 1.00 -0.07 0.84
dp 0.30 0.28 0.19 0.07 1.47 2.62 11.55 0.94 -0.03 -0.07 1.00 -0.07
r 1.11 1.49 5.56 -27.94 25.32 -0.48 2.19 0.10 0.07 0.84 -0.07 1.00
D6
gm -0.18 -0.14 3.60 -19.55 15.54 -0.25 4.95 0.90 1.00 -0.52 -0.12 0.06
ge 1.15 1.25 6.25 -26.66 23.96 -0.40 2.31 0.37 -0.52 1.00 0.03 0.81
dp 0.29 0.24 0.18 0.09 1.30 2.82 10.84 0.95 -0.12 0.03 1.00 -0.02
r 1.03 1.33 5.23 -26.08 21.77 -0.53 2.05 0.09 0.06 0.81 -0.02 1.00
D7
gm -0.19 -0.13 3.49 -20.22 13.01 -0.70 5.50 0.89 1.00 -0.51 0.00 0.07
ge 1.13 1.10 6.11 -26.90 23.42 -0.27 2.35 0.36 -0.51 1.00 -0.04 0.82
dp 0.29 0.25 0.20 0.09 1.62 3.26 15.04 0.95 0.00 -0.04 1.00 -0.03
r 1.07 1.31 5.19 -26.21 21.79 -0.48 2.32 0.09 0.07 0.82 -0.03 1.00
D8
gm -0.17 -0.07 3.40 -20.44 14.26 -0.70 6.76 0.89 1.00 -0.53 -0.02 0.06
ge 1.05 1.06 5.89 -24.46 22.63 -0.32 2.40 0.37 -0.53 1.00 -0.02 0.81
dp 0.28 0.27 0.14 0.08 1.02 1.15 2.92 0.95 -0.02 -0.02 1.00 -0.02
r 1.05 1.32 4.96 -23.33 19.21 -0.45 1.71 0.07 0.06 0.81 -0.02 1.00
D9
gm -0.15 -0.18 3.46 -21.10 13.86 -0.71 5.93 0.89 1.00 -0.57 -0.05 0.05
ge 0.91 1.01 5.62 -25.52 22.34 -0.30 2.56 0.41 -0.57 1.00 -0.03 0.79
dp 0.28 0.25 0.12 0.11 0.75 0.85 0.25 0.97 -0.05 -0.03 1.00 -0.04
r 0.96 1.22 4.61 -22.12 18.22 -0.42 1.92 0.08 0.05 0.79 -0.04 1.00
D10
gm -0.09 0.03 3.48 -20.94 12.54 -0.97 5.84 0.89 1.00 -0.59 -0.05 0.07
ge 0.68 0.83 5.37 -22.35 22.24 -0.12 2.65 0.38 -0.59 1.00 -0.02 0.76
dp 0.25 0.23 0.09 0.07 0.54 0.63 -0.24 0.98 -0.05 -0.02 1.00 -0.03
r 0.87 1.13 4.26 -19.68 18.30 -0.36 1.65 0.00 0.07 0.76 -0.03 1.00
Note: Panel A presents the mean, median, standard deviation (SD), minimum, maximum, skewness (Skew), kurtosis (Kurt) and first-order autocorrelation coefficient (AR(1)) of the realised return components of the stock market and six size and book-to-market equity ratio sorted portfolios. The univariate statistics are presented separately for each portfolio. gm is the continuously compounded growth rate in the price-earnings ratio. ge is the continuously compounded growth rate in earnings. dp is the log of one plus the dividend-price ratio. r is the portfolio return. Panel B reports the correlation matrices for all seven portfolios. The sample period starts in February 1966 and ends in December 2019.
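
The component definitions in the note above imply that the log total return equals gm + ge + dp exactly. The sketch below checks this identity on made-up numbers. The price, earnings and dividend levels are hypothetical, and D here is taken to be the dividend paid within each period (the report instead divides an annual dividend by the data frequency).

```r
# Minimal sketch of the return decomposition r = gm + ge + dp.
# All input numbers are hypothetical and for illustration only.
P <- c(100, 104, 101, 108)       # price level
E <- c(5.0, 5.1, 5.3, 5.2)       # earnings
D <- c(0.25, 0.26, 0.26, 0.27)   # dividend paid within each period

n  <- length(P)
PE <- P / E

gm <- diff(log(PE))              # growth in the price-earnings multiple
ge <- diff(log(E))               # earnings growth
dp <- log(1 + D[-1] / P[-1])     # log of one plus the dividend-price ratio

# Log total return computed directly from prices and dividends.
r <- log((P[-1] + D[-1]) / P[-n])

decomp <- cbind(gm, ge, dp, sum_of_parts = gm + ge + dp, r)
print(round(decomp, 6))
```

The columns sum_of_parts and r coincide because gm + ge equals the log price change and dp adds the log of the dividend gross-up.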

Figure 3 - Cumulative OOS R-square Difference and Cumulative SSE Difference

The cumulative OOS R-square figures show the out-of-sample cumulative R-square up to each month from predictive regressions with the listed predictors and from the sum-of-the-parts (SOP) method for each portfolio. The cumulative SSE difference plots indicate the out-of-sample performance of each model. They are computed as the cumulative squared prediction error of the NULL minus the cumulative squared prediction error of the ALTERNATIVE. The NULL model is the historical mean model, while the ALTERNATIVE model is either the predictive regression model or the SOP model. An increase in the line suggests better performance of the ALTERNATIVE model, and a decrease suggests that the NULL model is better.
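
The two curves described above can be sketched in a few lines of base R. The function names and the return and forecast values below are hypothetical. The sketch only illustrates the definitions in the text and is not the report's plotting code.

```r
# Minimal sketch: cumulative SSE difference and cumulative OOS R-square.
# All numbers are hypothetical.
actual <- c(0.02, -0.01, 0.03, 0.00, 0.01)  # realised returns
f_null <- c(0.01,  0.01, 0.01, 0.01, 0.01)  # historical-mean (NULL) forecasts
f_alt  <- c(0.02,  0.00, 0.02, 0.01, 0.00)  # ALTERNATIVE model forecasts

# Cumulative squared error of the NULL minus that of the ALTERNATIVE.
cum_sse_diff <- function(actual, f_null, f_alt) {
  cumsum((actual - f_null)^2) - cumsum((actual - f_alt)^2)
}

# OOS R-square using all forecasts up to each date.
cum_oos_r2 <- function(actual, f_null, f_alt) {
  1 - cumsum((actual - f_alt)^2) / cumsum((actual - f_null)^2)
}

d  <- cum_sse_diff(actual, f_null, f_alt)
r2 <- cum_oos_r2(actual, f_null, f_alt)
print(d)
print(r2)
```

In this toy series the difference rises over the first three periods, stays flat in the fourth and falls in the fifth, the one period in which the NULL forecast is closer to the realised return.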

Several points to note in the coding:

  1. The dividend-price ratio (‘DP’ hereafter) is calculated as the log of 1 plus the frequency-adjusted dividend-to-price ratio, rather than using the annual dividend. Under this return decomposition, the dividend expected to be paid in each period should be scaled by the frequency of the data used in the analysis: \[ dp_t = \log \left(1 + \frac{\tilde{D}_t}{P_t}\right) = \log \left(1 + \frac{D_t / n}{P_t}\right) \text{,} \] where \(D_t\) is the annual dividend payment, \(n\) is the data frequency (e.g. \(n = 1\) for annual data and \(n = 12\) for monthly data) and \(\tilde{D}_t\) is the frequency-adjusted dividend payment for period \(t\).

  2. The SOP method by Ferreira and Santa-Clara (2011) decomposes the portfolio return into three components, namely the earnings growth, the price multiple expansion and the next-period dividend-price ratio. To generate the SOP prediction, we use the rolling mean of past earnings growth as the expected growth of the next period (denoted as ge1). However, there are other choices, such as the recursive means in ge2 and ge3.

  3. critica.value = TRUE sets whether the bootstrap method is used to calculate the MSE-F critical values. This option is used in the function Boot_MSE.F.

  4. The authors should evaluate the significance of the MSE-F statistic by using the theoretical distribution derived in McCracken (2007). The bootstrap-based inference (presented on pages 9-10) can serve as a robustness check and be moved to an appendix. Furthermore, the authors can also include in the main results the related out-of-sample statistic proposed by Clark and West (2007), which follows a standard Normal distribution. Therefore, readjust the Boot_MSE.F function.

  5. Column McCracken in Table 2 (line 604) gives the significance of the out-of-sample \(MSE\text{-}F\) statistic of McCracken (2007). \(***\), \(**\), and \(*\) denote significance at the 1%, 5%, and 10% level, respectively. Please refer to Table 4 on p. 749 in McCracken (2007) with \(k_2 = 1\) and \(\pi = P/R = \frac{\text{Number of out-of-sample forecasts}}{\text{Number of observations used to form the first forecast}} = 1.6\).
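
The SOP forecast described in point 2 can be sketched as below. This is a simplified illustration under stated assumptions, not the report's implementation: the function name sop_forecast and the input values are hypothetical, the expected multiple growth is assumed to be zero, and the current dp is used as the forecast of the next-period dp. The rolling mean plays the role of ge1.

```r
# Minimal sketch of a sum-of-the-parts (SOP) forecast.
# Assumptions: E[gm] = 0 and the current dp proxies next period's dp.
sop_forecast <- function(ge, dp, window) {
  n <- length(ge)
  stopifnot(length(dp) == n, window >= 1, window <= n)
  fc <- rep(NA_real_, n)
  for (t in window:n) {
    ge_bar <- mean(ge[(t - window + 1):t])  # rolling mean of past earnings growth
    fc[t] <- ge_bar + dp[t]                 # forecast of the return in period t + 1
  }
  fc
}

# Hypothetical inputs for illustration only.
ge <- c(0.01, 0.03, -0.01, 0.02, 0.00, 0.01)
dp <- c(0.003, 0.003, 0.002, 0.002, 0.003, 0.003)

fc <- sop_forecast(ge, dp, window = 3)
print(fc)
```

Replacing the rolling window with an expanding one (all observations up to t) would give a recursive-mean variant of the kind mentioned for ge2 and ge3.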

## [1] "market_Allfirms.csv"
## [1] "Market"
## ##------ Thu Jun 30 20:04:26 2022 ------##
## [1] "OOS R Squared: 0.0047"
## [1] "MSE-F: 1.8524"

## [1] "IS R Squared: 0.0106"
## [1] "OOS R Squared: -0.0104"
## [1] "MSE-F: -4.054"
## [1] "IS R Squared: 0.0045"
## [1] "OOS R Squared: -0.0023"
## [1] "MSE-F: -0.9222"
## [1] "IS R Squared: 0.0051"
## [1] "OOS R Squared: -0.0017"
## [1] "MSE-F: -0.658"
## [1] "IS R Squared: 0.012"
## [1] "OOS R Squared: -0.0086"
## [1] "MSE-F: -3.3515"
## [1] "IS R Squared: 1e-04"
## [1] "OOS R Squared: -0.0079"
## [1] "MSE-F: -3.1035"

## [1] "port_D1.csv"
## [1] "D1"
## ##------ Thu Jun 30 20:04:30 2022 ------##
## [1] "OOS R Squared: -0.5669"
## [1] "MSE-F: -141.4607"

## [1] "IS R Squared: 4e-04"
## [1] "OOS R Squared: -0.0291"
## [1] "MSE-F: -11.0156"
## [1] "IS R Squared: 0.0016"
## [1] "OOS R Squared: -0.0151"
## [1] "MSE-F: -5.8017"
## [1] "IS R Squared: 0.0056"
## [1] "OOS R Squared: -0.0178"
## [1] "MSE-F: -6.8044"
## [1] "IS R Squared: 0.0012"
## [1] "OOS R Squared: -0.0441"
## [1] "MSE-F: -16.4754"
## [1] "IS R Squared: 0"
## [1] "OOS R Squared: -0.0086"
## [1] "MSE-F: -3.339"

## [1] "port_D2.csv"
## [1] "D2"
## ##------ Thu Jun 30 20:04:34 2022 ------##
## [1] "OOS R Squared: -0.1811"
## [1] "MSE-F: -59.9607"

## [1] "IS R Squared: 5e-04"
## [1] "OOS R Squared: -0.0305"
## [1] "MSE-F: -11.541"
## [1] "IS R Squared: 0.0028"
## [1] "OOS R Squared: -0.0084"
## [1] "MSE-F: -3.2532"
## [1] "IS R Squared: 0.0049"
## [1] "OOS R Squared: -0.0098"
## [1] "MSE-F: -3.7681"
## [1] "IS R Squared: 9e-04"
## [1] "OOS R Squared: -0.0425"
## [1] "MSE-F: -15.8936"
## [1] "IS R Squared: 0"
## [1] "OOS R Squared: -0.0092"
## [1] "MSE-F: -3.5494"

## [1] "port_D3.csv"
## [1] "D3"
## ##------ Thu Jun 30 20:04:38 2022 ------##
## [1] "OOS R Squared: -0.0274"
## [1] "MSE-F: -10.418"

## [1] "IS R Squared: 0.0016"
## [1] "OOS R Squared: -0.018"
## [1] "MSE-F: -6.9087"
## [1] "IS R Squared: 0.0029"
## [1] "OOS R Squared: -0.0041"
## [1] "MSE-F: -1.5962"
## [1] "IS R Squared: 0.0046"
## [1] "OOS R Squared: -0.0043"
## [1] "MSE-F: -1.6733"
## [1] "IS R Squared: 0.0026"
## [1] "OOS R Squared: -0.0244"
## [1] "MSE-F: -9.286"
## [1] "IS R Squared: 0"
## [1] "OOS R Squared: -0.01"
## [1] "MSE-F: -3.8693"

## [1] "port_D4.csv"
## [1] "D4"
## ##------ Thu Jun 30 20:04:41 2022 ------##
## [1] "OOS R Squared: -0.0085"
## [1] "MSE-F: -3.2908"

## [1] "IS R Squared: 0.0025"
## [1] "OOS R Squared: -0.0173"
## [1] "MSE-F: -6.6337"
## [1] "IS R Squared: 0.0025"
## [1] "OOS R Squared: -0.005"
## [1] "MSE-F: -1.9575"
## [1] "IS R Squared: 0.0041"
## [1] "OOS R Squared: -0.0058"
## [1] "MSE-F: -2.256"
## [1] "IS R Squared: 0.0037"
## [1] "OOS R Squared: -0.0226"
## [1] "MSE-F: -8.6159"
## [1] "IS R Squared: 2e-04"
## [1] "OOS R Squared: -0.0088"
## [1] "MSE-F: -3.3832"

## [1] "port_D5.csv"
## [1] "D5"
## ##------ Thu Jun 30 20:04:44 2022 ------##
## [1] "OOS R Squared: -0.0051"
## [1] "MSE-F: -1.9879"

## [1] "IS R Squared: 0.0035"
## [1] "OOS R Squared: -0.0218"
## [1] "MSE-F: -8.3224"
## [1] "IS R Squared: 0.0029"
## [1] "OOS R Squared: -0.0052"
## [1] "MSE-F: -2"
## [1] "IS R Squared: 0.0046"
## [1] "OOS R Squared: -0.0054"
## [1] "MSE-F: -2.1139"
## [1] "IS R Squared: 0.0048"
## [1] "OOS R Squared: -0.0262"
## [1] "MSE-F: -9.9457"
## [1] "IS R Squared: 6e-04"
## [1] "OOS R Squared: -0.0235"
## [1] "MSE-F: -8.9566"

## [1] "port_D6.csv"
## [1] "D6"
## ##------ Thu Jun 30 20:04:47 2022 ------##
## [1] "OOS R Squared: -9e-04"
## [1] "MSE-F: -0.352"

## [1] "IS R Squared: 0.0075"
## [1] "OOS R Squared: -0.0129"
## [1] "MSE-F: -4.9616"
## [1] "IS R Squared: 0.0031"
## [1] "OOS R Squared: -0.0052"
## [1] "MSE-F: -2.0297"
## [1] "IS R Squared: 0.0043"
## [1] "OOS R Squared: -0.0053"
## [1] "MSE-F: -2.0705"
## [1] "IS R Squared: 0.0091"
## [1] "OOS R Squared: -0.0156"
## [1] "MSE-F: -5.995"
## [1] "IS R Squared: 0.0022"
## [1] "OOS R Squared: -0.0126"
## [1] "MSE-F: -4.8617"

## [1] "port_D7.csv"
## [1] "D7"
## ##------ Thu Jun 30 20:04:50 2022 ------##
## [1] "OOS R Squared: -0.0077"
## [1] "MSE-F: -2.9897"

## [1] "IS R Squared: 0.0029"
## [1] "OOS R Squared: -0.0138"
## [1] "MSE-F: -5.3136"
## [1] "IS R Squared: 0.002"
## [1] "OOS R Squared: -0.0077"
## [1] "MSE-F: -2.9837"
## [1] "IS R Squared: 0.0032"
## [1] "OOS R Squared: -0.0077"
## [1] "MSE-F: -2.9915"
## [1] "IS R Squared: 0.0038"
## [1] "OOS R Squared: -0.0167"
## [1] "MSE-F: -6.4175"
## [1] "IS R Squared: 6e-04"
## [1] "OOS R Squared: -0.0099"
## [1] "MSE-F: -3.8074"

## [1] "port_D8.csv"
## [1] "D8"
## ##------ Thu Jun 30 20:04:53 2022 ------##
## [1] "OOS R Squared: -9e-04"
## [1] "MSE-F: -0.3587"

## [1] "IS R Squared: 0.004"
## [1] "OOS R Squared: -0.0272"
## [1] "MSE-F: -10.341"
## [1] "IS R Squared: 0.0022"
## [1] "OOS R Squared: -0.0072"
## [1] "MSE-F: -2.7867"
## [1] "IS R Squared: 0.003"
## [1] "OOS R Squared: -0.0063"
## [1] "MSE-F: -2.4524"
## [1] "IS R Squared: 0.0049"
## [1] "OOS R Squared: -0.027"
## [1] "MSE-F: -10.242"
## [1] "IS R Squared: 0.0011"
## [1] "OOS R Squared: -0.0056"
## [1] "MSE-F: -2.1667"

## [1] "port_D9.csv"
## [1] "D9"
## ##------ Thu Jun 30 20:04:56 2022 ------##
## [1] "OOS R Squared: 0.0011"
## [1] "MSE-F: 0.4171"

## [1] "IS R Squared: 0.0057"
## [1] "OOS R Squared: -0.017"
## [1] "MSE-F: -6.5235"
## [1] "IS R Squared: 0.0024"
## [1] "OOS R Squared: -0.0063"
## [1] "MSE-F: -2.4603"
## [1] "IS R Squared: 0.0033"
## [1] "OOS R Squared: -0.006"
## [1] "MSE-F: -2.324"
## [1] "IS R Squared: 0.0072"
## [1] "OOS R Squared: -0.0175"
## [1] "MSE-F: -6.7038"
## [1] "IS R Squared: 7e-04"
## [1] "OOS R Squared: -0.0093"
## [1] "MSE-F: -3.5812"

## [1] "port_D10.csv"
## [1] "D10"
## ##------ Thu Jun 30 20:04:59 2022 ------##
## [1] "OOS R Squared: 0.0043"
## [1] "MSE-F: 1.6932"

## [1] "IS R Squared: 0.0075"
## [1] "OOS R Squared: -0.0064"
## [1] "MSE-F: -2.463"
## [1] "IS R Squared: 0.0051"
## [1] "OOS R Squared: -0.0014"
## [1] "MSE-F: -0.5451"
## [1] "IS R Squared: 0.005"
## [1] "OOS R Squared: -0.001"
## [1] "MSE-F: -0.3962"
## [1] "IS R Squared: 0.0074"
## [1] "OOS R Squared: -0.0045"
## [1] "MSE-F: -1.7453"
## [1] "IS R Squared: 2e-04"
## [1] "OOS R Squared: -0.0104"
## [1] "MSE-F: -4.0226"

Table 2 - Forecasts of portfolio returns

This table reports the in-sample (IS) and out-of-sample (OOS) R-squares for the market and six size and book-to-market equity ratio sorted portfolios from predictive regressions and the Sum-of-the-Parts method. IS R-squares are estimated using the whole sample period, and the OOS R-squares compare the forecast error of the model against that of the historical mean model. The full sample period runs from Feb 1966 to December 2019, and the initial IS period is set to 20 years, with forecasts beginning in Feb 1986. The MSE-F statistics are calculated to test the hypothesis \(H_0: \text{out-of-sample R-squares} = 0\) vs \(H_1: \text{out-of-sample R-squares} \neq 0\).
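
The OOS R-square and MSE-F statistic can be computed from forecast errors as sketched below. The function name oos_stats and the numbers are hypothetical, and the report's own function (Boot_MSE.F) is not shown in this file, so this is only an illustration of the standard definitions: the OOS R-square is one minus the ratio of the model's mean squared error to that of the historical mean, and MSE-F is the number of forecasts times the relative reduction in mean squared error.

```r
# Minimal sketch: OOS R-square and MSE-F from forecast errors.
oos_stats <- function(actual, f_null, f_alt) {
  n_fc <- length(actual)                 # number of out-of-sample forecasts
  mse_null <- mean((actual - f_null)^2)  # historical mean model
  mse_alt  <- mean((actual - f_alt)^2)   # predictive regression or SOP
  c(oos_r2 = 1 - mse_alt / mse_null,
    mse_f  = n_fc * (mse_null - mse_alt) / mse_alt)
}

# Hypothetical numbers for illustration only.
actual <- c(0.02, -0.01, 0.03, 0.00, 0.01)
f_null <- c(0.01,  0.01, 0.01, 0.01, 0.01)
f_alt  <- c(0.02,  0.00, 0.02, 0.01, 0.00)

stats <- oos_stats(actual, f_null, f_alt)
print(stats)
```

Under these definitions the two statistics always share the same sign, since MSE-F equals the number of forecasts times R2 / (1 - R2).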

Predictors here are all in log terms.

gt(table2.df, rowname_col = "rowname", groupname_col = "portname") %>%
  tab_header(title = "Table 2 - Forecasts of portfolio returns",
             subtitle = paste(freq_name(freq = freq), " data starts from ", first(data_decompose$month), " and ends in ", last(data_decompose$month), ".", sep = "")) %>%
  fmt_number(columns = 1:4, decimals = 6, suffixing = TRUE)
Table 2 - Forecasts of portfolio returns
monthly data starts from Jun 1967 and ends in Dec 2019.
IS_r.squared OOS_r.squared MAE_A MSE_F McCracken
Market
DP 0.010630 −0.010370 0.032611 −4.054026
PE 0.004546 −0.002340 0.032475 −0.922206
EY 0.005117 −0.001669 0.032494 −0.657989
DY 0.011952 −0.008557 0.032621 −3.351461
Payout 0.000117 −0.007919 0.032081 −3.103538
SOP NA 0.004656 0.032173 1.852440 **
D1
DP 0.000439 −0.029066 0.044447 −11.015644
PE 0.001622 −0.015101 0.044070 −5.801714
EY 0.005586 −0.017757 0.044356 −6.804444
DY 0.001184 −0.044108 0.045117 −16.475351
Payout 0.000047 −0.008635 0.043375 −3.339003
SOP NA −0.566887 0.052130 −141.460695
D2
DP 0.000467 −0.030495 0.047819 −11.540992
PE 0.002820 −0.008412 0.047717 −3.253172
EY 0.004925 −0.009756 0.047846 −3.768056
DY 0.000915 −0.042484 0.048067 −15.893613
Payout 0.000012 −0.009185 0.047530 −3.549371
SOP NA −0.181129 0.051526 −59.960696
D3
DP 0.001631 −0.018034 0.044449 −6.908694
PE 0.002874 −0.004110 0.044443 −1.596221
EY 0.004630 −0.004309 0.044556 −1.673328
DY 0.002576 −0.024391 0.044620 −9.285966
Payout 0.000018 −0.010021 0.044259 −3.869314
SOP NA −0.027374 0.044362 −10.417976
D4
DP 0.002540 −0.017304 0.043274 −6.633728
PE 0.002521 −0.005045 0.043230 −1.957534
EY 0.004053 −0.005818 0.043344 −2.256005
DY 0.003746 −0.022591 0.043446 −8.615903
Payout 0.000199 −0.008751 0.042782 −3.383181
SOP NA −0.008488 0.042567 −3.290842
D5
DP 0.003544 −0.021805 0.041423 −8.322424
PE 0.002911 −0.005155 0.041578 −1.999967
EY 0.004553 −0.005450 0.041686 −2.113905
DY 0.004773 −0.026169 0.041533 −9.945682
Payout 0.000599 −0.023506 0.041555 −8.956643
SOP NA −0.005110 0.040826 −1.987885
D6
DP 0.007538 −0.012886 0.038562 −4.961571
PE 0.003074 −0.005232 0.038408 −2.029705
EY 0.004334 −0.005337 0.038426 −2.070501
DY 0.009085 −0.015612 0.038662 −5.995041
Payout 0.002232 −0.012623 0.038319 −4.861703
SOP NA −0.000901 0.038014 −0.351957
D7
DP 0.002852 −0.013813 0.037588 −5.313632
PE 0.001974 −0.007709 0.037404 −2.983673
EY 0.003182 −0.007730 0.037443 −2.991503
DY 0.003813 −0.016731 0.037724 −6.417546
Payout 0.000611 −0.009859 0.037268 −3.807448
SOP NA −0.007705 0.037043 −2.989741
D8
DP 0.004029 −0.027238 0.036928 −10.341038
PE 0.002150 −0.007197 0.036441 −2.786657
EY 0.002967 −0.006328 0.036457 −2.452352
DY 0.004868 −0.026970 0.036983 −10.242040
Payout 0.001061 −0.005587 0.036286 −2.166654
SOP NA −0.000918 0.035986 −0.358676
D9
DP 0.005742 −0.017012 0.034266 −6.523519
PE 0.002431 −0.006349 0.033788 −2.460305
EY 0.003254 −0.005995 0.033831 −2.323957
DY 0.007159 −0.017490 0.034372 −6.703793
Payout 0.000684 −0.009268 0.033579 −3.581160
SOP NA 0.001066 0.033398 0.417082
D10
DP 0.007504 −0.006356 0.031756 −2.463024
PE 0.005105 −0.001400 0.031572 −0.545121
EY 0.004975 −0.001017 0.031554 −0.396172
DY 0.007397 −0.004495 0.031714 −1.745345
Payout 0.000239 −0.010422 0.031440 −4.022555
SOP NA 0.004312 0.031357 1.693190 **

Figure 4 - Monthly return predictions

Here I only present the monthly predictions of the historical mean model, the SOP method and the predictive regressions based on the dividend-price ratio and the earnings-price ratio.

## [1] "market_Allfirms.csv"
## [1] "Market"
## ##------ Thu Jun 30 20:05:08 2022 ------##

## [1] "port_D1.csv"
## [1] "D1"
## ##------ Thu Jun 30 20:05:11 2022 ------##

## [1] "port_D2.csv"
## [1] "D2"
## ##------ Thu Jun 30 20:05:14 2022 ------##

## [1] "port_D3.csv"
## [1] "D3"
## ##------ Thu Jun 30 20:05:17 2022 ------##

## [1] "port_D4.csv"
## [1] "D4"
## ##------ Thu Jun 30 20:05:20 2022 ------##

## [1] "port_D5.csv"
## [1] "D5"
## ##------ Thu Jun 30 20:05:24 2022 ------##

## [1] "port_D6.csv"
## [1] "D6"
## ##------ Thu Jun 30 20:05:27 2022 ------##

## [1] "port_D7.csv"
## [1] "D7"
## ##------ Thu Jun 30 20:05:30 2022 ------##

## [1] "port_D8.csv"
## [1] "D8"
## ##------ Thu Jun 30 20:05:33 2022 ------##

## [1] "port_D9.csv"
## [1] "D9"
## ##------ Thu Jun 30 20:05:36 2022 ------##

## [1] "port_D10.csv"
## [1] "D10"
## ##------ Thu Jun 30 20:05:38 2022 ------##

Figure 5 - Trading Performance (with no trading restrictions)

## Warning in window.default(x, ...): 'start' value not changed

## Warning in xy.coords(x = matrix(rep.int(tx, k), ncol = k), y = x, log = log, :
## 769 y values <= 0 omitted from logarithmic plot

## Warning in xy.coords(x = matrix(rep.int(tx, k), ncol = k), y = x, log = log, :
## 776 y values <= 0 omitted from logarithmic plot

Table 3 - Certainty equivalent gains

Trading Strategies: certainty equivalent gains

This table shows the out-of-sample portfolio choice results at monthly frequencies from predictive regressions and the SOP method. The trading strategy for each portfolio is designed by optimally allocating funds between the risk-free asset and the corresponding risky portfolio. The certainty equivalent return is \(\overline{rp} - \frac{1}{2} \gamma \hat{\sigma}_{rp}^{2}\) with a risk-aversion coefficient \(\gamma = 3\). The annualised certainty equivalent gain (in percentage) is the monthly certainty equivalent gain multiplied by the corresponding frequency (e.g. 12 for monthly data).
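
The certainty equivalent calculation described above can be sketched as follows. The function name ce_return and the strategy returns are hypothetical. Measuring the gain against a benchmark strategy and using the sample variance (n - 1 denominator) are assumptions of this sketch rather than details stated in the text.

```r
# Minimal sketch: certainty equivalent return and annualised gain.
ce_return <- function(rp, gamma = 3) {
  mean(rp) - 0.5 * gamma * var(rp)  # var() is the sample variance
}

freq <- 12  # monthly data

# Hypothetical per-period strategy returns for illustration only.
rp_model <- c(0.02, 0.00, 0.01, 0.03)  # strategy based on a forecasting model
rp_bench <- c(0.01, 0.01, 0.00, 0.02)  # benchmark strategy (assumed)

ceg_period <- ce_return(rp_model) - ce_return(rp_bench)
ceg_annual_pct <- 100 * freq * ceg_period  # annualised gain in percent
print(ceg_annual_pct)
```

A positive value means that an investor with risk aversion 3 would prefer the model-based strategy to the benchmark over this sample.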

dt <- table3.df %>%
  filter(rowname %in% c(ratio_names, "sop_simple")) %>%
  select(CEGs_annualised, rowname, portname)

as.data.frame(matrix(dt$CEGs_annualised, byrow = FALSE, nrow = length(ratio_names) + 1, ncol = length(id.names))) %>%
  `colnames<-`(unique(dt$portname)) %>%
  mutate(Variable = unique(dt$rowname)) %>%
  # round(digits = 4) %>%
  as.tbl() %>%
  select(Variable, unique(dt$portname)) %>%
  gt(rowname_col = "Variable") %>%
  tab_header(title = "Table 3 - Trading Strategies: certainty equivalent gains",
             subtitle = paste(str_to_title(freq_name(freq = freq)), " data starts from ", first(data_decompose$month) + 20, " and ends in ", last(data_decompose$month), ".", sep = "")) %>%
  fmt_percent(columns = 2:(length(id.names)+1), decimals = 2)
Table 3 - Trading Strategies: certainty equivalent gains
Monthly data starts from Jun 1987 and ends in Dec 2019.
Market D1 D2 D3 D4 D5 D6 D7 D8 D9 D10
sop_simple 1.10% −129.71% −19.06% −204.31% 204.90% −93.68% −2.58% −363.83% −0.09% −0.75% −10.97%
DP −2.85% −47.58% 8.84% 1,378.43% −420.52% −933.88% −30.98% −765.73% −4.56% −0.31% 26.16%
PE 0.84% 28.95% 9.86% 1,759.94% 786.85% 42.07% 2.32% 462.07% 1.23% 2.21% 24.46%
EY 0.98% 5.39% 7.42% 1,801.02% 698.73% 42.13% 2.50% 436.62% 1.46% 2.30% 21.00%
DY −2.36% −177.29% 5.32% 1,420.84% −1,599.26% −1,333.02% −36.62% −1,211.53% −5.12% −0.61% 26.77%
Payout −4.13% 11.27% −6.17% −1,860.46% 103.03% −405.48% −10.61% 75.00% −1.74% −6.78% −35.95%

Table 4 - Sharpe ratio Gains

Trading Strategies: Sharpe ratio Gains

This table presents the Sharpe ratio of the out-of-sample performance of trading strategies, allocating funds between risk-free and risky assets for each portfolio. The annualised Sharpe ratio is obtained by multiplying the monthly Sharpe ratio by the square root of the corresponding frequency (e.g. \(\sqrt{12}\) for monthly data).

# Keep the annualised Sharpe ratio gains for the predictor ratios and the SOP method.
dt <- table4.df %>%
  filter(rowname %in% c(ratio_names, "sop_simple")) %>%
  select(SRG_annualised, rowname, portname)

# Reshape to one row per predictor and one column per portfolio.
# The matrix is filled column by column, so dt is assumed to be ordered by portfolio.
as.data.frame(matrix(dt$SRG_annualised, byrow = FALSE, nrow = length(ratio_names) + 1, ncol = length(id.names))) %>%
  `colnames<-`(unique(dt$portname)) %>%
  mutate(Variable = unique(dt$rowname)) %>%
  as_tibble() %>% # as.tbl() is deprecated in dplyr
  select(Variable, all_of(unique(dt$portname))) %>%
  gt(rowname_col = "Variable") %>%
  tab_header(title = "Table 4 - Trading Strategies: Sharpe ratio gains",
             # the out-of-sample period starts 20 years after the first observation
             subtitle = paste(str_to_title(freq_name(freq = freq)), " data starts from ", first(data_decompose$month) + 20, " and ends in ", last(data_decompose$month), ".", sep = "")) %>%
  fmt_number(columns = 2:(length(id.names) + 1), decimals = 4)
Table 4 - Trading Strategies: Sharpe ratio gains
Monthly data starts from Jun 1987 and ends in Dec 2019.
Market D1 D2 D3 D4 D5 D6 D7 D8 D9 D10
sop_simple 0.0524 0.1183 0.2084 0.0123 0.0139 −0.0795 −0.0356 −0.0092 −0.0043 −0.0311 −0.0873
DP −0.1667 0.3491 0.2782 0.0260 0.3530 −0.1944 −0.4374 −0.0373 −0.1938 −0.0383 0.3242
PE 0.0674 0.3669 0.2792 0.1240 0.4891 0.5023 0.1201 0.4385 0.0734 0.1482 0.2115
EY 0.0827 0.3611 0.2410 0.6750 0.3967 0.5056 0.1295 0.4108 0.0926 0.1597 0.1044
DY −0.1441 0.3265 0.2599 0.0357 0.3469 −0.1993 −0.4555 −0.0423 −0.2066 −0.0582 0.3512
Payout −0.1547 0.0506 −0.1361 −0.0142 −0.0021 −0.1651 −0.2732 0.0031 −0.0866 −0.2317 −0.1107
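
The annualisation rule in the Table 4 caption can be illustrated as follows. This is a sketch under stated assumptions: `sharpe_ratio` and `sharpe_gain_annualised` are hypothetical helper names, the inputs are toy excess returns, and the gain is taken as the difference from a benchmark strategy, which the caption does not spell out.

```r
# Per-period Sharpe ratio from excess returns (return minus risk-free rate)
sharpe_ratio <- function(excess_ret) {
  mean(excess_ret) / sd(excess_ret)
}

# Annualised Sharpe ratio gain over a benchmark strategy
# (hypothetical helper; the benchmark comparison is an assumption).
sharpe_gain_annualised <- function(ex_model, ex_bench, freq = 12) {
  sqrt(freq) * (sharpe_ratio(ex_model) - sharpe_ratio(ex_bench))
}

# Toy monthly excess returns, for illustration only
ex_model <- c(0.02, -0.01, 0.03, 0.00, 0.01)
ex_bench <- c(0.01, -0.02, 0.02, 0.01, 0.00)

sharpe_gain_annualised(ex_model, ex_bench, freq = 12)
```

The square-root scaling follows from the mean growing linearly with the horizon while the standard deviation grows with its square root, under the usual assumption of serially uncorrelated returns.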

Figure 6 - Sensitivity of Certainty Equivalent Gains relative to Risk-Aversion level

This figure presents the out-of-sample portfolio choice results at monthly frequency from bivariate predictive regressions and the SOP method with different levels of risk-aversion. To show that our previous results hold with respect to investors with different levels of risk aversion, we evaluate the changes in certainty equivalent gains with respect to the changes in the level of risk-aversion. The results of the trading strategy reported here are without trading restrictions (as in Table 5), allocating funds between the risk-free asset and the risky equity portfolio. The portfolio choice results are evaluated in the certainty equivalent return with relative risk-aversion coefficient \(\gamma \in \{0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5\}\). Risky equity portfolios include the market portfolio and six size and book-to-market equity sorted portfolios, BH, BM, BL, SH, SM and SL. The annualised certainty equivalent gain is the monthly certainty equivalent gain multiplied by twelve. The sample period is from February 1966 to December 2019 and the out-of-sample period starts in March 1986.
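
A sensitivity exercise of this kind can be sketched as a loop over the risk-aversion grid. The block below is only an illustration on simulated data: the function name `ce_gain_by_gamma`, the constant forecasts, and in particular the mean-variance weight (forecast divided by \(\gamma\) times the return variance) are assumptions, since the caption does not state the allocation rule used.

```r
# Annualised CE gain of a model forecast over a benchmark forecast,
# for each risk-aversion level (hypothetical helper).
ce_gain_by_gamma <- function(excess, rf, fc_model, fc_bench, sigma2,
                             gammas = seq(0.5, 5, by = 0.5), freq = 12) {
  cer <- function(rp, gamma) mean(rp) - 0.5 * gamma * var(rp)
  gains <- sapply(gammas, function(g) {
    # Assumed mean-variance weights, with no trading restrictions (no clipping)
    w_model <- fc_model / (g * sigma2)
    w_bench <- fc_bench / (g * sigma2)
    rp_model <- rf + w_model * excess
    rp_bench <- rf + w_bench * excess
    freq * (cer(rp_model, g) - cer(rp_bench, g))
  })
  data.frame(gamma = gammas, ce_gain_annualised = gains)
}

# Simulated inputs, for illustration only
set.seed(1)
n        <- 240
excess   <- rnorm(n, mean = 0.005, sd = 0.04)  # realised monthly excess returns
rf       <- rep(0.003, n)                      # risk-free rate
fc_model <- rep(0.006, n)                      # hypothetical model forecast
fc_bench <- rep(0.004, n)                      # hypothetical benchmark forecast
sigma2   <- var(excess)                        # variance estimate

sens <- ce_gain_by_gamma(excess, rf, fc_model, fc_bench, sigma2)
sens
```

Plotting `ce_gain_annualised` against `gamma` for each portfolio would give one line per portfolio, which is the layout the figure title suggests.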

## Warning: Removed 45 rows containing missing values (geom_point).
## Warning: Removed 45 row(s) containing missing values (geom_path).

Table 5 - MSPE-adjusted Statistic

MSPE-adjusted Statistic

This table presents the MSPE-adjusted statistics, which evaluate the statistical significance of the out-of-sample R-squared statistic of each model in the corresponding portfolio.

See Rapach et al. (2010) and Clark and West (2007) for the detailed procedure.
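
For orientation, the Clark and West (2007) MSPE-adjusted statistic can be sketched as below. This is not the code behind `TABLE5`: the function name `mspe_adjusted`, the simulated inputs, and the use of a standard normal upper-tail p-value are assumptions made for illustration. The output names mirror the columns of the table.

```r
# MSPE-adjusted statistic for a model forecast against a benchmark forecast
# (hypothetical helper following Clark and West, 2007).
mspe_adjusted <- function(actual, fc_model, fc_bench) {
  e_bench <- actual - fc_bench
  e_model <- actual - fc_model
  # Adjusted loss differential: benchmark squared error minus the model's
  # squared error corrected for the noise in the larger model's forecast
  f <- e_bench^2 - (e_model^2 - (fc_bench - fc_model)^2)
  # t-statistic from regressing f on a constant
  fit <- summary(lm(f ~ 1))
  t_stat <- unname(fit$coefficients[1, "t value"])
  c(OOS_r.squared = 1 - sum(e_model^2) / sum(e_bench^2),
    mspe_t        = t_stat,
    mspe_pvalue   = 1 - pnorm(t_stat))  # one-sided test, assumed normal
}

# Simulated inputs, for illustration only
set.seed(42)
actual   <- rnorm(120, mean = 0.005, sd = 0.04)
fc_bench <- rep(0.004, 120)                       # e.g. a historical-mean forecast
fc_model <- 0.004 + rnorm(120, mean = 0, sd = 0.005)

res <- mspe_adjusted(actual, fc_model, fc_bench)
res
```

The test is one-sided, so a negative out-of-sample R-squared can still come with a positive `mspe_t`, as several rows of the table show.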

# Stack the per-portfolio results into one long data frame.
table5.df <- data.frame()
for (port in names(TABLE5)) {
  pt <- TABLE5[[port]]
  pt$rowname <- rownames(pt)   # model name (DP, PE, ..., SOP)
  pt$portname <- port          # portfolio name (Market, D1, ..., D10)
  colnames(pt)[4] <- "star"    # significance marker column
  table5.df <- rbind.data.frame(table5.df, pt)
}

# One row group per portfolio; vars() is deprecated in gt, so use c() instead.
table5.output <- gt(table5.df, rowname_col = "rowname", groupname_col = "portname") %>%
  fmt_percent(columns = c(OOS_r.squared, mspe_pvalue), decimals = 2) %>%
  fmt_number(columns = c(mspe_t), decimals = 4) %>%
  tab_header(title = "Table 5 - MSPE-adjusted Statistic",
             subtitle = paste(str_to_title(freq_name(freq = freq)), " data starts from ", first(data_decompose$month), " and ends in ", last(data_decompose$month), ".", sep = ""))

table5.output
Table 5 - MSPE-adjusted Statistic
Monthly data starts from Jun 1967 and ends in Dec 2019.
OOS_r.squared mspe_t mspe_pvalue star
Market
DP −1.04% 0.7486 22.73%
PE −0.23% 0.6344 26.31%
EY −0.17% 0.7180 23.66%
DY −0.86% 0.9945 16.03%
Payout −0.79% −1.4743 92.94%
SOP 0.47% 1.2007 11.53%
D1
DP −2.91% 0.4125 34.01%
PE −1.51% 0.3564 36.09%
EY −1.78% 1.0408 14.93%
DY −4.41% 0.8909 18.68%
Payout −0.86% −1.0410 85.07%
SOP −56.69% −0.7927 78.58%
D2
DP −3.05% −1.2378 89.17%
PE −0.84% 0.5064 30.64%
EY −0.98% 0.8232 20.54%
DY −4.25% −0.8405 79.94%
Payout −0.92% −0.3355 63.13%
SOP −18.11% 0.0442 48.24%
D3
DP −1.80% −0.2117 58.38%
PE −0.41% 0.5892 27.80%
EY −0.43% 0.8518 19.74%
DY −2.44% 0.2176 41.39%
Payout −1.00% −1.4016 91.91%
SOP −2.74% −0.0240 50.96%
D4
DP −1.73% −0.3366 63.17%
PE −0.50% 0.4347 33.20%
EY −0.58% 0.6585 25.53%
DY −2.26% −0.0916 53.65%
Payout −0.88% −0.4451 67.18%
SOP −0.85% −0.4245 66.43%
D5
DP −2.18% −0.6679 74.77%
PE −0.52% 0.4901 31.22%
EY −0.54% 0.7207 23.58%
DY −2.62% −0.4381 66.92%
Payout −2.35% −1.4232 92.23%
SOP −0.51% −0.2072 58.20%
D6
DP −1.29% 0.2118 41.62%
PE −0.52% 0.4753 31.74%
EY −0.53% 0.6554 25.63%
DY −1.56% 0.4103 34.09%
Payout −1.26% −0.9724 83.43%
SOP −0.09% 0.6373 26.21%
D7
DP −1.38% −0.2826 61.12%
PE −0.77% 0.1336 44.69%
EY −0.77% 0.3369 36.82%
DY −1.67% 0.0232 49.07%
Payout −0.99% −0.8438 80.03%
SOP −0.77% −0.4797 68.42%
D8
DP −2.72% 0.5399 29.48%
PE −0.72% 0.1439 44.28%
EY −0.63% 0.3159 37.61%
DY −2.70% 0.6972 24.30%
Payout −0.56% −0.0519 52.07%
SOP −0.09% 0.2767 39.11%
D9
DP −1.70% 0.7368 23.08%
PE −0.63% 0.2897 38.61%
EY −0.60% 0.4304 33.36%
DY −1.75% 0.9876 16.20%
Payout −0.93% −1.5827 94.28%
SOP 0.11% 0.6881 24.59%
D10
DP −0.64% 0.4025 34.38%
PE −0.14% 0.7255 23.43%
EY −0.10% 0.7078 23.97%
DY −0.45% 0.4899 31.23%
Payout −1.04% −0.3614 64.10%
SOP 0.43% 1.0774 14.10%